1. Method for automatic community model generation based on uni-parity data, comprising the steps of:
hypothesizing a subset S of set U, wherein for any pair of items in said subset S there exists a mathematical function C applicable to said pair of items so as to generate a correlation value and correlation relationship between any said pair of items in subset S;
generating said correlation values by applying said function C to each of said pairs of items in said subset S;
graphing G(S,E), wherein E is the edge set of said graph G with computed correlation values as weights; and
mapping said graph G to one of its subgraphs $M \subset G$ so as to generate a community.

2. Method of claim 1, wherein said correlation relationship and said correlation value are defined by:
$\forall p, q \in S \subset U,\quad C : S \times S \to [0,1]$
and wherein said correlation value is in the range of [0,1].
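
The generating and graphing steps of claim 1, together with the [0,1] constraint of claim 2, can be illustrated with a minimal sketch. The claims do not specify the function C; the Jaccard overlap used here, the function names `jaccard` and `build_graph`, and the sample items are assumptions chosen only for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Hypothetical correlation function C: S x S -> [0, 1] (an assumption,
    not the function of the claims)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def build_graph(items, corr):
    """Apply corr to every pair of items in S and return the weighted
    edge set E of G(S, E) as {(p, q): weight}."""
    edges = {}
    for p, q in combinations(sorted(items), 2):
        w = corr(items[p], items[q])
        if not 0.0 <= w <= 1.0:
            raise ValueError("correlation value must lie in [0, 1]")
        edges[(p, q)] = w
    return edges

# Illustrative subset S: each item is described by a set of attributes.
S = {"p": {1, 2, 3}, "q": {2, 3, 4}, "r": {9}}
E = build_graph(S, jaccard)
```

Any symmetric function with values in [0,1] could be passed in place of `jaccard`; the range check simply enforces the constraint stated in claim 2.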

3. Method for solving a community generation problem, comprising the steps of:
converting documents to digital form and tagging said digitized documents;
parsing said digitized and tagged documents to extract the transaction history vector for each individual;
creating timelines of said transaction vectors so as to form a timeline map;
determining the relevancy of said vectors;
projecting said vectors along a time dimension so as to form a histogram;
translating said vectors into groups of activities by histogram clustering;
determining the local correlation between any pair of clusters in the timeline of two individuals;
computing the global correlations between pairs of individuals;
converting data to a graph as a function of all individuals extracted from said documents and the correlation values between said individuals;
generating models based on a search of all subgraphs with correlation values above a threshold; and
outputting a group model.
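
The last three steps of claim 3 (graph conversion, search for subgraphs above a threshold, and output of a group model) can be sketched as follows. This is one possible reading only: it assumes that a "group" is a connected component of the graph after edges below the threshold are dropped. The function name `groups_above_threshold` and the sample correlation values are hypothetical.

```python
def groups_above_threshold(individuals, correlations, threshold):
    """Keep edges whose correlation is >= threshold and return the
    connected components with more than one member as groups.

    correlations: {(x, y): value} for pairs of individuals.
    Treating groups as connected components is an assumption.
    """
    adjacency = {v: set() for v in individuals}
    for (x, y), c in correlations.items():
        if c >= threshold:
            adjacency[x].add(y)
            adjacency[y].add(x)

    seen, groups = set(), []
    for start in individuals:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first search
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(adjacency[u] - component)
        seen |= component
        if len(component) > 1:  # isolated individuals form no group
            groups.append(sorted(component))
    return groups

people = ["A", "B", "C", "D"]
corr = {("A", "B"): 0.9, ("B", "C"): 0.7, ("C", "D"): 0.1, ("A", "D"): 0.2}
model = groups_above_threshold(people, corr, threshold=0.5)
```

With these illustrative values, individuals A, B and C are linked by edges at or above 0.5, while D is connected only through weaker edges and is left out of the group.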

4. Method of claim 3, wherein said step of parsing further comprises the step of applying the "one way nearest neighbor" principle.

5. Method of claim 4, wherein said "one way nearest neighbor" principle further comprises the following steps as applied to a money laundering problem:
for every person's name encountered, the first immediate time instance is the first time instance for a series of financial activities; the second immediate time instance is the second time instance for another series of financial activities, etc.;
for every time instance encountered, all the subsequent financial activities are considered as the series of financial activities between this time instance and the next time instance;
financial activities are identified in terms of money amount; money amount is neutral in terms of deposit or withdrawal;
each person's time sequence of financial activities is updated if new financial activities of this person are encountered in other places of the same document or in other documents; and
the financial activities of each time instance of a person are updated if new financial activities of this time instance of the same person are encountered in other places of the same document or in other documents.
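
The "one way nearest neighbor" steps above can be sketched over a stream of tagged tokens. The token kinds `NAME`, `DATE` and `AMOUNT`, the function name `parse_tokens`, and the sample documents are assumptions made for illustration; the claims do not specify a tagging scheme.

```python
from collections import defaultdict

def parse_tokens(tokens, history=None):
    """Attach each amount to the nearest preceding date, and each date to
    the nearest preceding name (one-way, looking backwards only).

    tokens:  iterable of (kind, value), kind in {"NAME", "DATE", "AMOUNT"}
    history: {person: {date: [amounts]}}, reused across documents so that
             later mentions update earlier records.
    """
    if history is None:
        history = defaultdict(lambda: defaultdict(list))
    person = date = None
    for kind, value in tokens:
        if kind == "NAME":
            person, date = value, None          # new person, no date yet
        elif kind == "DATE" and person is not None:
            date = value
            history[person][date]               # register the time instance
        elif kind == "AMOUNT" and person is not None and date is not None:
            # Amounts are neutral: deposits and withdrawals are not told apart.
            history[person][date].append(abs(float(value)))
    return history

doc1 = [("NAME", "Alice"), ("DATE", "2001-03-01"), ("AMOUNT", "-500"),
        ("AMOUNT", "200"), ("DATE", "2001-03-05"), ("AMOUNT", "50"),
        ("NAME", "Bob"), ("DATE", "2001-03-01"), ("AMOUNT", "500")]
doc2 = [("NAME", "Alice"), ("DATE", "2001-03-01"), ("AMOUNT", "75")]

history = parse_tokens(doc1)
history = parse_tokens(doc2, history)  # second document updates Alice's record
```

Passing the same `history` object to successive calls is what implements the two update steps: a later mention of the same person, or of the same time instance of that person, extends the existing record instead of replacing it.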

6. Method of claim 3, wherein said step of determining the relevancy of said vectors further comprises a step of focusing on "clusters" of vectors in said timeline map and ignoring scattered (i.e., non-clustered) vectors in said timeline map.

7. Method of claim 3, wherein said step of translating said vectors into groups of activities further comprises solving a standard histogram clustering problem; and simplifying said standard clustering problem by virtue of all individuals sharing the same said timeline.
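
Claims 6 and 7 can be illustrated with a minimal sketch that projects one individual's transactions onto a shared timeline and groups neighbouring non-empty bins. The claims do not name a clustering algorithm; the run-based grouping below, the function names, and the parameters `max_gap` and `min_bins` are assumptions.

```python
def histogram(events, n_bins):
    """Project (bin_index, amount) events onto a shared timeline of
    n_bins bins, summing absolute amounts per bin."""
    hist = [0.0] * n_bins
    for t, amount in events:
        hist[t] += abs(amount)
    return hist

def cluster_bins(hist, max_gap=1, min_bins=2):
    """Group non-empty bins whose indices differ by at most max_gap.
    Runs shorter than min_bins are treated as scattered and ignored.
    Returns a list of (first_bin, last_bin) clusters."""
    clusters, current = [], []
    for t, value in enumerate(hist):
        if value <= 0:
            continue
        if current and t - current[-1] > max_gap:
            if len(current) >= min_bins:
                clusters.append((current[0], current[-1]))
            current = []
        current.append(t)
    if len(current) >= min_bins:
        clusters.append((current[0], current[-1]))
    return clusters

events = [(0, 100), (1, 50), (2, -20), (7, 10), (12, 5), (13, 5)]
hist = histogram(events, n_bins=15)
clusters = cluster_bins(hist)
```

Because every individual is binned on the same timeline, cluster boundaries from different individuals are directly comparable by bin index, which is the simplification claim 7 refers to. In this example the isolated transaction in bin 7 is dropped as scattered.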

8. Method of claim 3, wherein said step of computing correlations between pairs of individuals further comprises computing the global correlation of all local correlations between pairs of individuals.

9. Method of claim 8, further comprising the step of computing local correlations by computing the correlation between two clusters corresponding to a pair of individuals on said histograms.

10. Method of claim 9, wherein said step of computing correlations between two clusters further comprises the step of computing the fuzzified correlation between $f_{x_i}(t)$ and $f_{y_j}(t)$, the financial transaction histogram functions of individuals x and y in clusters i and j, respectively.

11. Method of claim 10, wherein said step of computing the fuzzified correlation between $f_{x_i}(t)$ and $f_{y_j}(t)$ further comprises the step of computing the maximum correlation value
[equation not recoverable from the source text; the surviving fragments indicate a maximum over a sum involving a difference of two time arguments, a term $(a-b)^2$, and a term $|a-b|$]

12. Method of claim 8, wherein said step of computing the global correlation of all local correlations between pairs of individuals further comprises computing the dot product between two vectors as follows:

$$C(x, y) = C_y(x) \cdot C_x(y) = \sum_{i=1}^{K} C(x_i, y)\, C(y_i, x)$$

where the vectors $C_y(x)$ and $C_x(y)$ are defined as

$$C_y(x) = \langle C(x_i, y),\ i = 1, \ldots, K \rangle$$
$$C_x(y) = \langle C(y_i, x),\ i = 1, \ldots, K \rangle$$

where

$$C(x_i, y) = \max_{j=1}^{K} \{ g(x_i, y_j)\, S(i, j) \}$$

$$S(i, j) = e^{-\frac{(i-j)^2}{2\sigma^2}}$$

and where the maximum is taken over the set

$$\{ g(x_i, y_j)\, S(i, j),\ j = 1, \ldots, K \}$$
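
The dot product of claim 12 can be sketched numerically as below. Several points are assumptions rather than statements of the claimed method: the local correlations $g(x_i, y_j)$ are supplied as a K×K matrix, $g$ is taken to be symmetric in its arguments so that $g(y_i, x_j)$ is the transposed entry, and $S(i,j)$ is taken to be a Gaussian in the index difference with a hypothetical width `sigma`.

```python
import math

def S(i, j, sigma=1.0):
    """Assumed Gaussian weighting on the distance between cluster indices."""
    return math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))

def directed(g, sigma=1.0):
    """For each cluster i, the best weighted match: max over j of g[i][j] * S(i, j)."""
    return [max(g[i][j] * S(i, j, sigma) for j in range(len(g[i])))
            for i in range(len(g))]

def global_correlation(g_xy, sigma=1.0):
    """C(x, y) as the dot product of the vectors <C(x_i, y)> and <C(y_i, x)>.

    g_xy[i][j] holds the local correlation g(x_i, y_j); the matrix is
    assumed square (K x K).
    """
    K = len(g_xy)
    # g(y_i, x_j) is read from the transpose (symmetry assumption).
    g_yx = [[g_xy[j][i] for j in range(K)] for i in range(K)]
    c_y_of_x = directed(g_xy, sigma)   # components C(x_i, y)
    c_x_of_y = directed(g_yx, sigma)   # components C(y_i, x)
    return sum(a * b for a, b in zip(c_y_of_x, c_x_of_y))

aligned = global_correlation([[1.0, 0.0], [0.0, 1.0]])
shifted = global_correlation([[0.0, 1.0], [0.0, 0.0]])
```

In the first example each cluster of x matches the cluster of y at the same index, so both vectors have full-strength components. In the second, the only match is between cluster 0 of x and cluster 1 of y, so the non-zero components of the two vectors sit at different indices and the dot product vanishes.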

13. Method of claim 3, wherein said step of converting data to a graph further
comprises obtaining a complete graph G(V, E), where V is the set of all the
individuals extracted from the given collection of the documents, and E is the set
of all the correlation values between individuals such that for any correlation C(x,
y), there is a corresponding edge in G with the weight C between the two nodes x
and y.

14. Method of claim 3, wherein said step of generating models further comprises
the step of identifying links as a graph segmentation based on a minimum
correlation threshold value.

15. Method of claim 14, wherein said minimum threshold value is selected based
upon a user's expertise.